Report: GWAS analysis

1 Project Summary

Parameter Value
Project test-gwas
Pipeline Version v0.2
Date 2021-08-09
Phenotype File test/regenie_pheno_input.pheno.validated.txt
Phenotype p21001_i0
Covariates COV1
Regenie Output p21001_i0.regenie.gz

2 Phenotype Statistics

2.1 Overview

╭──────────────────────────────────────────────── skimpy summary ─────────────────────────────────────────────────╮
│          Data Summary                Data Types                                                                 │
│ ┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓ ┏━━━━━━━━━━━━━┳━━━━━━━┓                                                          │
│ ┃ dataframe          Values ┃ ┃ Column Type  Count ┃                                                          │
│ ┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩ ┡━━━━━━━━━━━━━╇━━━━━━━┩                                                          │
│ │ Number of rows    │ 63930  │ │ float64     │ 1     │                                                          │
│ │ Number of columns │ 1      │ └─────────────┴───────┘                                                          │
│ └───────────────────┴────────┘                                                                                  │
│                                                     number                                                      │
│ ┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓  │
│ ┃ column_name          NA     NA %      mean      sd      p0     p25     p75     p100     hist      ┃  │
│ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩  │
│ │ p21001_i0              0       0      27   5.2   13    24    30     68   ▂█▂    │  │
│ └─────────────────────┴───────┴──────────┴──────────┴────────┴───────┴────────┴────────┴─────────┴───────────┘  │
╰────────────────────────────────────────────────────── End ──────────────────────────────────────────────────────╯

2.2 Phenotype distribution

3 Main results

The main results are summarised here by a Manhattan plot and a QQ plot.

Annotated SNPs represent the configured top hits (SNPs with LOG10P above 7.3) processed as follows:

  1. the SNP with the highest LOG10P value is selected for each gene
  2. the resulting SNPs are further grouped by 500kb windows and the SNP with the highest LOG10P value is then annotated in the plot with the name of the closest gene
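The two selection steps above can be sketched with pandas. This is only an illustration, not the pipeline's own code: the column names `CHROM`, `GENPOS` and `GENE`, the function name, and the use of fixed, non-overlapping 500 kb bins are all assumptions, since the report does not say how the windows are defined.

```python
import pandas as pd

def select_annotated_snps(hits: pd.DataFrame, window_bp: int = 500_000) -> pd.DataFrame:
    """Pick the SNPs to annotate from a table of top hits.

    Assumed (hypothetical) columns: CHROM, GENPOS, LOG10P, GENE.
    """
    hits = hits.reset_index(drop=True)  # idxmax needs unique index labels
    # Step 1: keep the SNP with the highest LOG10P for each gene.
    per_gene = hits.loc[hits.groupby("GENE")["LOG10P"].idxmax()]
    # Step 2: bin positions into fixed windows (an assumption) and keep
    # the SNP with the highest LOG10P in each chromosome/window bin.
    per_gene = per_gene.assign(WINDOW=per_gene["GENPOS"] // window_bp)
    best = per_gene.loc[per_gene.groupby(["CHROM", "WINDOW"])["LOG10P"].idxmax()]
    return (
        best.drop(columns="WINDOW")
        .sort_values(["CHROM", "GENPOS"])
        .reset_index(drop=True)
    )

# Toy values for illustration only, not taken from the report.
toy_hits = pd.DataFrame({
    "CHROM": [16, 16, 16, 1],
    "GENPOS": [53_800_000, 53_810_000, 53_900_000, 177_900_000],
    "LOG10P": [29.8, 12.0, 9.0, 19.8],
    "GENE": ["FTO", "FTO", "GENE_X", "SEC16B"],
})
annotated = select_annotated_snps(toy_hits)
print(annotated)
```

In the toy data, the two chromosome 16 genes fall in the same 500 kb bin, so only the stronger SNP (labelled FTO) is kept alongside the chromosome 1 SNP.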

4 Genes on top SNPs

This table summarises the closest genes for the configured top hits (SNPs with LOG10P above 7.3). It reports the list of genes overlapped by top hits and, for each gene:

  • the number of top hit SNPs in the gene (VARIANTS)
  • the -log10 P-value of the most associated SNP (GENE_LOG10P)
  • the minimum SNP-gene distance across top SNPs (MIN_DISTANCE)
CLOSEST_GENE_NAME CLOSEST_GENE_CHROMOSOME CLOSEST_GENE_START CLOSEST_GENE_END GENE_LOG10P VARIANTS MIN_DISTANCE
17 FTO 16 53737875 54155853 29.82500 128 0
37 SEC16B 1 177893091 177953438 19.78290 105 0
15 FAM150B 2 279558 288851 16.37420 7 119862
50 TMEM18 2 667335 677439 12.71700 223 13462
21 MC4R 18 58038564 58040001 12.14110 100 75448
6 CAAP1 9 26840683 26892802 11.88510 14 221728
29 PMAIP1 18 57567180 57571538 11.16240 71 160880
47 SULT1A1 16 28616903 28634946 11.02060 57 0
45 STX1B 16 31000577 31021949 11.02020 6 0
54 ZNF646 16 31085743 31095517 10.89830 2 0
55 ZNF668 16 31072164 31085641 10.86730 10 0
52 VKORC1 16 31102163 31107301 10.82340 2 388
46 STX4 16 31044210 31054296 10.70670 9 0
28 NUPR1 16 28548606 28550495 10.35330 10 172
3 BCKDK 16 31117428 31124110 10.28040 10 0
8 CCDC101 16 28565236 28603111 10.19650 37 0
18 IL27 16 28510683 28523372 10.12930 5 0
48 SULT1A2 16 28603264 28608430 9.98147 12 0
19 KAT8 16 31127075 31142714 9.95059 14 0
32 PRSS8 16 31142756 31147083 9.74864 2 0
31 PRSS36 16 31150246 31161415 9.64629 1 1105
12 DNAJC27 2 25166505 25194963 9.61703 23 0
0 ADCY3 2 25042038 25142708 9.60099 77 0
26 NPIPB6 16 28353876 28374829 9.37402 7 3818
2 ATP2A1 16 28889726 28915830 9.29087 11 0
30 POC5 5 74969949 75013313 9.27994 13 0
13 EIF3C 16 28699879 28747051 9.20802 4 0
33 RABEP2 16 28915742 28947847 9.06204 5 0
20 LAT 16 28996147 29002104 9.06019 1 0
23 MON1A 3 49946302 49967606 9.05754 6 0
9 CD19 16 28943260 28950667 9.02880 2 1971
44 SPNS1 16 28985542 28995869 9.01589 6 0
14 EIF3CL 16 28390900 28415200 8.99549 1 7658
39 SH2B1 16 28857921 28885526 8.99470 2 0
24 MST1R 3 49924435 49941299 8.98547 9 0
5 C6orf106 6 34555065 34664636 8.93545 42 0
25 NFATC2IP 16 28962128 28978418 8.90139 3 0
16 FAM92A1 8 94710789 94743755 8.87419 1 134867
49 TMEM161B 5 87485450 87565293 8.77920 2 129464
1 ANKDD1B 5 74907284 74967671 8.74342 3 0
43 SORCS3 10 106400859 107024993 8.72631 1 524597
51 UHRF1BP1 6 34759857 34850915 8.69119 2 0
36 SBK1 16 28303840 28335170 8.67686 3 2869
7 CAMKV 3 49895421 49907655 8.61833 6 0
41 SNRPC 6 34725183 34741571 8.37954 4 1873
4 BCL7C 16 30844947 30906281 8.34259 3 0
34 RBM5 3 50126341 50156454 8.33358 5 0
53 ZNF629 16 30789778 30798523 8.30771 1 22343
35 RBM6 3 49977440 50137478 8.29983 51 0
27 NPIPB8 16 28648975 28670003 8.29261 3 0
42 SNX11 17 46180719 46200436 8.28665 1 0
11 CTD-2330K9.3 3 49941278 49954370 8.23756 1 0
22 MIB2 1 1550795 1565990 8.22438 6 0
40 SKAP1 17 46210802 46507637 8.17291 2 0
38 SETD1A 16 30968615 30996437 8.09561 2 0
10 CLN3 16 28477983 28506896 8.05179 1 0
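A per-gene summary of this kind can be sketched as a pandas group-by. This is a minimal illustration, not the pipeline's implementation: the input column `DISTANCE` and the function name are hypothetical, and the input is assumed to hold one row per top hit SNP with its closest gene.

```python
import pandas as pd

def summarise_genes(hits: pd.DataFrame) -> pd.DataFrame:
    """Aggregate top-hit SNPs per closest gene.

    Assumed (hypothetical) input columns: CLOSEST_GENE_NAME, LOG10P, DISTANCE.
    """
    return (
        hits.groupby("CLOSEST_GENE_NAME")
        .agg(
            GENE_LOG10P=("LOG10P", "max"),     # strongest SNP for the gene
            VARIANTS=("LOG10P", "size"),       # number of top-hit SNPs
            MIN_DISTANCE=("DISTANCE", "min"),  # 0 when a SNP lies in the gene
        )
        .reset_index()
        .sort_values("GENE_LOG10P", ascending=False)
    )

# Toy values for illustration only, not taken from the report.
toy_gene_hits = pd.DataFrame({
    "CLOSEST_GENE_NAME": ["B", "A", "B"],
    "LOG10P": [8.0, 9.5, 12.0],
    "DISTANCE": [100, 0, 0],
})
gene_summary = summarise_genes(toy_gene_hits)
print(gene_summary)
```

Because the row index is assigned in alphabetical gene order before sorting by GENE_LOG10P, the sorted output carries a non-sequential leading index. The unlabelled first column in the table above looks consistent with that pattern, although this is an inference rather than something the report states.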

5 Top Loci

This table lists the top associated loci after clumping with PLINK. The minimum -log10 P-value for index SNPs is 7.3.

CHR SNP P N POS KB RANGES
0 16 rs56094641_A_G 1.496000e-30 116 chr16:53797908..53845487 47.580 [FTO]
1 1 rs539515_A_C 1.649000e-20 71 chr1:177793822..177913519 119.698 [SEC16B]
2 16 rs12446228_A_G 1.673000e-17 14 chr16:53797565..53848561 50.997 [FTO]
3 2 rs62106258_T_C 4.225000e-17 7 chr2:408713..466003 57.291 []
4 16 16:53825987_CA_C_CA_C 4.109000e-14 1 chr16:53825987..53825987 0.001 [FTO]
5 2 rs6744653_A_G 1.919000e-13 238 chr2:600575..653874 53.300 [TMEM18]
6 18 rs35614134_A_AC 7.226000e-13 182 chr18:57732418..57914679 182.262 []
7 9 rs11791087_C_G 1.303000e-12 15 chr9:26586297..26618114 31.818 []
8 1 rs531385_T_C 1.540000e-12 23 chr1:177849243..177922471 73.229 [SEC16B]
9 16 rs7187961_T_C 3.350000e-12 1 chr16:53826034..53826034 0.001 [FTO]
10 16 16:28621572_GT_G_GT_G 9.537000e-12 317 chr16:28372911..28871200 498.290 [APOBR,ATP2A1,ATXN2L,CCDC101,CLN3,EIF3C,EIF3CL...
11 16 rs34898535_C_T 9.546000e-12 80 chr16:30820866..31149142 328.277 [BCKDK,BCL7C,CTF1,FBXL19,FBXL19-AS1,HSD3B7,KAT...
12 18 rs17700633_G_A 1.628000e-10 75 chr18:57916904..58017249 100.346 [MC4R]
13 2 rs10865321_T_C 2.415000e-10 173 chr2:25075281..25321768 246.488 [ADCY3,DNAJC27,DNAJC27-AS1,EFR3B]
14 1 1:177890130_CT_C_CT_C 2.525000e-10 3 chr1:177848153..177890130 41.978 [SEC16B]
15 16 rs2466826_G_A 4.226000e-10 33 chr16:28330968..28582849 251.882 [APOBR,CCDC101,CLN3,EIF3C,EIF3CL,IL27,MIR6862-...
16 16 rs11642449_C_T 5.118000e-10 70 chr16:28809610..29001460 191.851 [ATP2A1,ATXN2L,CD19,LAT,LOC100289092,MIR4517,M...
17 5 rs6874626_G_A 5.249000e-10 108 chr5:74934009..75038901 104.893 [ANKDD1B,POC5]
18 3 rs7634084_A_T 7.600000e-10 277 chr3:49734229..50197097 462.869 [AMIGO3,APEH,CAMKV,CDHR4,FAM212A,GMPPB,IP6K1,M...
19 16 16:28988633_TA_T_TA_T 9.641000e-10 1 chr16:28988633..28988633 0.001 [LAT,MIR4517,NFATC2IP,SPNS1]
20 6 rs2744948_A_G 1.160000e-09 451 chr6:34548206..34832661 284.456 [ANKS1A,C6orf106,SNRPC,SPDEF,TAF11,UHRF1BP1]
21 8 rs528627779_C_A 1.336000e-09 1 chr8:94575923..94575923 0.001 [LINC00535]
22 1 rs10913446_T_C 1.540000e-09 29 chr1:177791704..177819166 27.463 []
23 5 rs11424167_T_TA 1.663000e-09 75 chr5:87449984..87840688 390.705 [LINC00461,LOC102546226,TMEM161B,TMEM161B-AS1]
24 10 rs555419279_A_C 1.878000e-09 1 chr10:107549590..107549590 0.001 []
25 1 rs11583928_T_C 2.110000e-09 2 chr1:177922617..177924529 1.913 [SEC16B]
26 1 rs12135579_C_T 2.213000e-09 7 chr1:177770097..177779502 9.406 []
27 1 rs10913468_C_G 2.769000e-09 1 chr1:177913402..177913402 0.001 [SEC16B]
28 17 rs12939514_T_C 5.168000e-09 117 chr17:46039553..46442786 403.234 [CBX1,CDK5RAP3,COPZ2,LOC100506325,MIR152,MIR12...
29 1 rs74892851_C_A 5.965000e-09 29 chr1:1529994..1601052 71.059 [C1orf233,CDK11B,MIB2,MMP23A,MMP23B,SLC35E2B,S...
30 9 rs10757617_C_G 9.523000e-09 16 chr9:26583608..26628485 44.878 []
31 7 rs34982641_G_A 1.360000e-08 50 chr7:75038408..75196531 158.124 [HIP1,NSUN5P1,PMS2P3,POM121C,SPDYE5,TRIM73,TRI...
32 6 6:34708410_GGGGAGTAAGTACAAGGTTGCTAGTCT_G_GGGGA... 1.383000e-08 5 chr6:34580221..34801188 220.968 [C6orf106,SNRPC,UHRF1BP1]
33 19 rs34783010_G_T 1.493000e-08 10 chr19:46179043..46202172 23.130 [FBXO46,GIPR,MIR642A,MIR642B,QPCTL,SNRPD2]
34 9 rs139480803_C_T 1.578000e-08 2 chr9:26494530..26586507 91.978 []
35 12 12:24836332_ATACAT_A_ATACAT_A 1.628000e-08 1 chr12:24836332..24836332 0.001 []
36 18 rs9945307_A_G 1.642000e-08 30 chr18:57730096..57801689 71.594 []
37 2 rs199992729_T_C 1.737000e-08 73 chr2:100637788..100797912 160.125 [AFF3]
38 2 rs2058625_A_T 1.748000e-08 5 chr2:58960544..59042342 81.799 [LINC01122]
39 2 rs12468708_C_A 1.784000e-08 53 chr2:58866584..58999505 132.922 [LINC01122]
40 1 rs144847043_G_A 1.946000e-08 1 chr1:177830321..177830321 0.001 []
41 16 16:28737148_CT_C_CT_C 2.036000e-08 1 chr16:28737148..28737148 0.001 [EIF3C,EIF3CL,MIR6862-1,MIR6862-2]
42 16 16:28864262_CAT_C_CAT_C 2.770000e-08 1 chr16:28864262..28864262 0.001 [ATXN2L,MIR4721,SH2B1,TUFM]
43 6 rs765469261_T_TTC 3.370000e-08 22 chr6:34553048..34828553 275.506 [C6orf106,SNRPC,TAF11,UHRF1BP1]
44 4 rs527434584_A_AT 4.162000e-08 1 chr4:137887773..137887773 0.001 []
45 9 rs187181201_T_C 4.372000e-08 1 chr9:127668244..127668244 0.001 [GOLGA1]
46 1 rs10913483_C_A 4.389000e-08 35 chr1:177938437..177978859 40.423 [LOC730102,SEC16B]
47 16 rs12922786_T_C 4.814000e-08 78 chr16:30150001..30343236 193.236 [BOLA2,BOLA2B,CD2BP2,CORO1A,LOC388242,LOC60672...
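The threshold is given on the -log10 scale while the P column above is on the raw scale. The short sketch below converts between the two, assuming LOG10P means -log10(P) as in regenie output; the helper name `to_log10p` is introduced here for illustration only.

```python
import math

LOG10P_THRESHOLD = 7.3  # threshold quoted in this report

# Equivalent raw P-value threshold: 10^-7.3, roughly 5e-8.
p_threshold = 10 ** (-LOG10P_THRESHOLD)

def to_log10p(p: float) -> float:
    """Convert a raw P-value to the -log10 scale."""
    return -math.log10(p)

# P-values of the first and last index SNPs in the table above.
lead_log10p = to_log10p(1.496e-30)
last_log10p = to_log10p(4.814e-08)

print(f"P threshold: {p_threshold:.3g}")
print(f"lead SNP LOG10P: {lead_log10p:.3f}")
print(f"last SNP LOG10P: {last_log10p:.3f}")
```

The lead SNP value agrees with the GENE_LOG10P of 29.825 reported for FTO in section 4, and the last index SNP sits just above the 7.3 cut-off.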

5.1 Locus 1 - rs56094641_A_G

5.2 Locus 2 - rs539515_A_C

5.3 Locus 3 - rs12446228_A_G

5.4 Locus 4 - rs62106258_T_C

5.5 Locus 5 - 16:53825987_CA_C_CA_C

6 Validation and Logs

6.1 Phenotype File Validation

Name Value
0 Samples 63930

6.2 Covariate File Validation

Name Value
0 Samples 60724

6.3 Regenie Step 1 Log

Name Value
0 Regenie Version v3.1.3.gz
1 Variants total (*.bim) 417948
2 Number of defined phenotypes 1
3 Phenotyped individuals total 60724
4 Number of defined covariates 21
5 Phenotyped individuals used 60724
6 Regenie Call --step 1 --bed step1_dataset_autosomes_QCe...

6.4 Regenie Step 2 Log

Name Value
0 Regenie Version v3.1.3.gz
1 MAC limit 50
2 Imputation info score limit 0.0
3 Number of defined phenotypes 1
4 Phenotyped individuals total 60608
5 Number of defined covariates 21
6 Phenotyped individuals used 60608
7 Variants ignored (low MAC or low info score) 49064
8 Variants used (*.bgen or *.pvar) 42795932
9 Regenie Call --step 2 --bgen step2_dataset_autosomes.ma...

This report was created with nf-fast-regenie v0.2, a Nextflow pipeline developed by Edoardo Giacopuzzi at the Human Technopole Foundation, Milan, Italy. Plots are generated using the gwaslab package.